Numerical Field Extraction in Handwritten Incoming Mail Documents

Authors

  • Guillaume Koch
  • Laurent Heutte
  • Thierry Paquet
Abstract

In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to be extracted, combined with a set of contextual morphological features, to find the best label for each connected component. Applying an HMM-based syntactic analyzer to the whole document then makes it possible to locate and extract the fields of interest. Results reported on the extraction of zip codes, phone numbers and customer codes from handwritten incoming mail documents demonstrate the value of the proposed approach.
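To make the idea of an HMM-based syntactic analyzer concrete, here is a minimal sketch of Viterbi decoding over a sequence of connected components. It is not the authors' model: the state layout (one reject state plus one state per digit position of a five-digit field), the transition probabilities, and the per-component `p_digit` scores (standing in for whatever the morphological features would provide) are all assumptions made for illustration.

```python
import math

FIELD_LEN = 5            # assumed field syntax: five consecutive digits
REJECT = 0               # state 0 = "not part of a field"; states 1..5 = digit positions
N_STATES = FIELD_LEN + 1
NEG_INF = float("-inf")

def safe_log(p):
    """Log of a probability, with log(0) mapped to -inf."""
    return math.log(p) if p > 0 else NEG_INF

def build_model(p_enter=0.1):
    """Hypothetical left-to-right field model embedded in a reject loop."""
    init = [0.0] * N_STATES
    init[REJECT], init[1] = 1 - p_enter, p_enter
    trans = [[0.0] * N_STATES for _ in range(N_STATES)]
    trans[REJECT][REJECT], trans[REJECT][1] = 1 - p_enter, p_enter
    for k in range(1, FIELD_LEN):
        trans[k][k + 1] = 1.0          # digit k must be followed by digit k+1
    trans[FIELD_LEN][REJECT] = 1.0     # after the last digit, leave the field
    log_init = [safe_log(p) for p in init]
    log_trans = [[safe_log(p) for p in row] for row in trans]
    return log_init, log_trans

def decode(p_digit):
    """Viterbi decoding; p_digit[t] is an assumed digit score for component t."""
    log_init, log_trans = build_model()
    eps = 1e-9
    emis = []
    for p in p_digit:
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        emis.append([math.log(1 - p)] + [math.log(p)] * FIELD_LEN)
    score = [log_init[s] + emis[0][s] for s in range(N_STATES)]
    back = []
    for t in range(1, len(emis)):
        new_score, ptr = [], []
        for s in range(N_STATES):
            prev = max(range(N_STATES), key=lambda q: score[q] + log_trans[q][s])
            new_score.append(score[prev] + log_trans[prev][s] + emis[t][s])
            ptr.append(prev)
        score = new_score
        back.append(ptr)
    state = max(range(N_STATES), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):         # follow back-pointers to recover the path
        state = ptr[state]
        path.append(state)
    return path[::-1]

def extract_fields(path):
    """Return (first, last) component indices of each completed field."""
    return [(t - FIELD_LEN + 1, t) for t, s in enumerate(path) if s == FIELD_LEN]

scores = [0.1, 0.2, 0.9, 0.8, 0.95, 0.9, 0.85, 0.1, 0.3]
path = decode(scores)
fields = extract_fields(path)
```

Because the field syntax is encoded in the transitions, a component is labelled as a digit only when it sits inside a run that matches the whole expected pattern, which is the sense in which the decoding both locates and extracts the field.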

Similar resources

Numerical Sequence Extraction in Handwritten Incoming Mail Documents

In this communication, we propose a method for the automatic extraction of numerical fields in handwritten documents. The approach exploits the known syntactic structure of the numerical field to extract, combined with a set of contextual morphological features to find the best label to each connected component. Applying an HMM based syntactic analyzer on the overall document allows to localize...

Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents

In this paper, we present a method for the automatic extraction of numerical fields (zip codes, phone numbers, etc.) from incoming mail documents. The approach is based on a segmentation-driven recognition that aims at locating isolated and touching digits among the textual information. A syntactical analysis is then performed on each line of text in order to filter the sequences that respect a...

Recognition-based vs syntax-directed models for numerical field extraction in handwritten documents

In this article, two different strategies are proposed for numerical field extraction in weakly constrained handwritten documents. The first extends classical handwriting recognition methods, while the second is inspired from approaches usually chosen in the field of information extraction from electronic documents. The models and the implementation of these two opposed strategies are described...

Discrimination Between Digits and Outliers in Handwritten Documents Applied to the Extraction of Numerical Fields

In this article, we propose a numerical field extraction system from unconstrained handwritten documents. The system is based on a segmentation driven by recognition stage followed by a syntactical analysis which detects the sequences that may compose a numerical field. We focus here on the design of a digit classifier embedded in the segmentation/recognition process able to discriminate digits...

Localisation of Numerical Date Field in an Indian Handwritten Document

This paper describes a method to localise all those areas which may constitute the date field in an Indian handwritten document. Spatial patterns of the date field are studied from various handwritten documents and an algorithm is developed through statistical analysis to identify those sets of connected components which may constitute the date. Common date patterns followed in India are consid...

Publication date: 2003